(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization 

International Bureau 

(43) International Publication Date 
23 February 2006 (23.02.2006) 




PCT 



(10) International Publication Number 

WO 2006/018056 A1



(51) International Patent Classification: H04N 1/62, G06T 7/00, G06K 9/00

(21) International Application Number: PCT/EP2005/005907

(22) International Filing Date: 1 June 2005 (01.06.2005)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data: 10/919,226, 16 August 2004 (16.08.2004), US



(71) Applicant (for all designated States except US): FOTONATION
VISION LIMITED [IE/IE]; Galway Business
Park, Dangen, Galway (IE). 

(72) Inventors; and 

(75) Inventors/Applicants (for US only): DELUCA, Michael,
J. [US/US]; 734 Camino Gardens Lane, Boca Raton, FL
33432 (US). STEINBERG, Eran [IL/US]; 137 Granville
Way, San Francisco, CA 94127 (US). PRILUTSKY,
Yury [US/US]; 1626 Shoal Drive, San Mateo, CA 94404
(US). CORCORAN, Peter [IE/IE]; Cregg, Claregalway,
Galway (IE). BIGIOI, Petronel [RO/IE]; 57 Sceilg Ard,
Headford Road, County Galway (IE).

(74) Agents: BOYCE, Conor et al.; F. R. Kelly & Co., 27 
Clyde Road, Ballsbridge, Dublin 4 (IE). 

(81) Designated States ( unless otherwise indicated, for every 
kind of national protection available): AE, AG, AL, AM, 
AT, AU, AZ, BA, BB, BG, BR, BW, BY, BZ, CA, CH, CN, 
CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, EG, ES, FI, 
GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, 
KG, KM, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, 
MD, MG, MK, MN, MW, MX, MZ, NA, NG, NI, NO, NZ, 
OM, PG, PH, PL, PT, RO, RU, SC, SD, SE, SG, SK, SL, 
SM, SY, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, 
VN, YU, ZA, ZM, ZW. 

(84) Designated States ( unless otherwise indicated, for every 
kind of regional protection available): ARIPO (BW, GH, 
GM, KE, LS, MW, MZ, NA, SD, SL, SZ, TZ, UG, ZM, 
ZW), Eurasian (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), 
European (AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI,




(54) Title: METHOD AND APPARATUS FOR DETECTING AND CORRECTING RED-EYE EFFECT 



[Figure: flowchart excerpt showing Load Acquired Image (510), Load Preview Image (520) and Match size between images]



(57) Abstract: A digital image acquisition system having no photographic
film, such as a digital camera, has a flash unit for providing illumination
during image capture and a red-eye filter for detecting a region within a
captured image indicative of a red-eye phenomenon, the detection being based
upon a comparison of the captured image and a reference image of nominally
the same scene taken without flash. In the embodiment the reference image is
a preview image of lower pixel resolution than the captured image, the filter
matching the pixel resolutions of the captured and reference images by
up-sampling the preview image and/or sub-sampling the captured image.
The filter also aligns at least portions of the captured image and reference
image prior to comparison to allow for, e.g., movement in the subject.







FR, GB, GR, HU, IE, IS, IT, LT, LU, MC, NL, PL, PT, RO, 
SE, SI, SK, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN,
GQ, GW, ML, MR, NE, SN, TD, TG). 

Published: 

— with international search report 



For two-letter codes and other abbreviations, refer to the "Guidance
Notes on Codes and Abbreviations" appearing at the beginning
of each regular issue of the PCT Gazette.






METHOD AND APPARATUS FOR DETECTING AND CORRECTING RED-EYE EFFECT 

FIELD OF THE INVENTION 

The invention relates generally to the area of digital photography, and more specifically to 
filtering "red-eye" artefacts from a flash-induced digital camera image. 

BACKGROUND OF THE INVENTION 

"Red-eye" is a phenomenon in flash photography where a flash is reflected within a subject's 
eye and appears in a photograph as a red dot where the black pupil of the subject's eye would 
normally appear. The unnatural glowing red of an eye is due to internal reflections from the 
vascular membrane behind the retina, which is rich in blood vessels. This objectionable 
phenomenon is well understood to be caused in part by a small angle between the flash of the 
camera and the lens of the camera. This angle has decreased with the miniaturization of 
cameras with integral flash capabilities. Additional contributors include the relative 
closeness of the subject to the camera and ambient light levels. 

The red-eye phenomenon can be minimized by causing the iris to reduce the opening of the 
pupil. This is typically done with a "pre-flash", a flash or illumination of light shortly before 
a flash photograph is taken. This causes the iris to close. Unfortunately, the pre-flash occurs an
objectionable 0.2 to 0.6 seconds before the flash photograph. This delay is readily
discernible and easily within the reaction time of a human subject. Consequently the subject
may believe the pre-flash is the actual photograph and be in a less than desirable position at
the time of the actual photograph. Alternatively, the subject must be informed of the pre-flash,
typically losing any spontaneity of the subject captured in the photograph.

Those familiar with the art have developed complex analysis processes operating within a 
camera prior to invoking a pre-flash. Various conditions are monitored prior to the 
photograph before the pre-flash is generated; the conditions include the ambient light level 
and the distance of the subject from the camera. Such a system is described in U.S. Pat. No. 
5,070,355. Although that invention minimizes the occurrences where a pre-flash is used, it 
does not eliminate the need for a pre-flash.

Digital cameras are becoming more popular and smaller in size. Digital cameras have several
advantages over film cameras. Digital cameras eliminate the need for film as the image is 
digitally captured and stored in a memory array for display on a display screen on the camera 
itself. This allows photographs to be viewed and enjoyed virtually instantaneously as 
opposed to waiting for film processing. Furthermore, the digitally captured image may be 
downloaded to another display device such as a personal computer or color printer for further 
enhanced viewing. Digital cameras include microprocessors for image processing and 
compression and camera systems control. It is possible to exploit the computation
capabilities of such [illegible in source]
red-eye phenomenon within, for example, a digital camera having a flash unit without the
distraction of a pre-flash.

U.S. Patent Application 2002/0150306 (Baron) described a method for the removal of flash 
artefacts by capturing two digital images of a subject, one with flash and one without flash, 
and subtracting one image from the other to provide an artefact image which is then 
thresholded and subtracted from the flash image. However, the technique is directed to flash 
artefacts in general, and not specifically to red-eye removal. There is no attempt to identify 
red-eye regions as compared to any other flash-induced artefacts. Indeed, there is no attempt 
to identify particular regions at all, since the technique is simply one of subtraction and 
thresholding. 

BRIEF SUMMARY OF THE INVENTION 

According to the present invention there is provided a digital image acquisition system 
having no photographic film, comprising a portable apparatus for capturing digital images, a 
flash unit for providing illumination during image capture, and a red-eye filter for detecting a 
region within a captured image indicative of a red-eye phenomenon, said detection being 
based upon a comparison of said captured image and a reference image of nominally the 
same scene taken without flash. 

In an embodiment of the invention the reference image is a preview image of lower pixel 
resolution than the captured image, the filter including means for matching the pixel
resolutions of the captured and reference images by up-sampling the preview image and/or
sub-sampling the captured image. 

To allow for inadvertent movement in the subject between taking the two images, the filter
preferably further includes means for aligning at least portions of the captured image and
reference image prior to said comparison.

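In the embodiment the filter detects said region indicative of a red-eye phenomenon by
identifying a region in the captured image having a colour indicative of a red-eye
phenomenon and comparing said identified region with the corresponding region in the
reference image, the filter further designating said region as indicative of a red-eye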
phenomenon if said corresponding region does not have a colour indicative of a red-eye 
phenomenon. The decision as to whether a region has a colour indicative of a red-eye 
phenomenon is determined on a statistical basis as a global operation on the entire region. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram of a camera apparatus operating in accordance with an embodiment 
of the present invention. 

FIG. 2 illustrates the workflow of the initial stage of a red-eye filter using preview data 
according to the embodiment. 

FIGS. 3-a to 3-d illustrate the pixelation process of an image in different resolutions.

FIG. 3-e is an enlargement of a hypothetical digitization of an eye in an image.

FIG. 4-a illustrates the pixel differences between a red-eye image and a non red-eye image.

FIGS. 5-a to 5-d and 6-a and 6-b illustrate the detailed workflow of the red-eye filter 
according to the embodiment, and alternatives. 

DESCRIPTION OF A PREFERRED EMBODIMENT

FIG. 1 shows a block diagram of an image acquisition system such as a digital camera
apparatus operating in accordance with the present invention. The digital acquisition device, 
also generically referred to in this application as a camera 20, includes a processor 120. It 
can be appreciated that many of the processes implemented in the digital camera may be 
implemented in or controlled by software operating in a microprocessor (μProc), central
processing unit (CPU), controller, digital signal processor (DSP) and/or an application 
specific integrated circuit (ASIC), collectively depicted as block 120 and termed as
"processor". Generically, all user interface and control of peripheral components such as
buttons and display [illegible in source]
controls the digital photographic process. Ambient light exposure is determined using light
sensor 40 in order to automatically determine if a flash is to be used. The distance to the 
subject is determined using focusing means 50 which also focuses the image on image 
capture means 60. If a flash is to be used, processor 120 causes the flash means 70 to 
generate a photographic flash in substantial coincidence with the recording of the image by 
image capture means 60 upon full depression of the shutter button. The image capture means 
60 digitally records the image in color. The image capture means is known to those familiar 
with the art and may include a CCD (charge coupled device) or CMOS to facilitate digital 
recording. The flash may be selectively generated either in response to the light sensor 40 or 
a manual input 72 from the user of the camera. The image recorded by image capture means 
60 is stored in image store means 80 which may comprise computer memory such as a dynamic
random access memory or a non-volatile memory. The camera is equipped with a display
100, such as an LCD, for preview and post-view of images. In the case of preview images, 
which are generated in the pre-capture mode 32, the display 100 can assist the user in 
composing the image, as well as being used to determine focusing and exposure. In the case of
postview, the image display can assist the user in viewing suspected red-eye regions and in
manually deciding, after viewing a region, whether it should be corrected. A temporary
storage space 82 is used to store one or a plurality of the preview images and can be part of the
image store means 80 or a separate component. The preview image is usually generated by
the same image capture means 60, and for speed and memory efficiency reasons may be 
generated by subsampling the image 124 using software which can be part of the general 
processor 120 or dedicated hardware, before displaying 100 or storing 82 the preview image. 
Depending on the settings of this hardware subsystem, the pre-acquisition image processing 
may satisfy some predetermined criteria prior to storing the preview image. Such criteria
may be chronological, such as saving an image every 0.5 seconds; more sophisticated criteria
may be analysis of the image for changes, or the detection of faces in the image. A 
straightforward preferred embodiment is to constantly replace the previous saved preview 
image with a new captured preview image during the pre-capture mode 32, until the final full 
resolution image is captured by full depression of the shutter button.

The red-eye filter 90 can be integral to the camera 20 or part of an external processing device 
10 such as a desktop computer, a hand held device, a cell phone handset or a server. In this 
embodiment, the filter receives the captured image from the full resolution image storage 80
as [illegible in source]
analyzes the stored image for characteristics of red-eye and, if found, modifies the image and
removes the red-eye phenomenon from the image as will be described in more detail. The
red-eye filter includes a pixel locator 92 for locating pixels having a color indicative of red-eye; a
shape analyzer 94 for determining if a grouping of at least a portion of the pixels located by
the pixel locator comprises a shape indicative of red-eye; a falsing analyzer 96 for processing
the image around the grouping for details indicative of an image of an eye; and a pixel
modifier 98 for modifying the color of pixels within the grouping. The modified image may
be either displayed on image display 100, saved on a persistent storage 112 which can be
internal or a removable storage such as CF card, SD card or the like, or downloaded to
another device, such as a personal computer, server or printer via image output means 110
which can be tethered or wireless.

In a further embodiment where the red-eye filter 90 is located on an external application in a
separate device 10, such as a desktop computer, the final captured image stored in block 80
along with a representation of the preview image as temporarily stored in 82, may be stored
prior to modification on the storage device 112, or transferred together via the image output
means 110 onto the external device 10, later to be processed by the red-eye filter 90.

Figure 2 details the initial stage of the workflow of this embodiment. It will be understood
that both this initial stage and the subsequent red-eye correction stage (Fig. 5) will typically
be performed by software in the camera and/or separate device 10. A preview image 
(normally of lesser resolution than the final image) is generated while the camera is in the 
pre-capture mode 32 such as when the user half presses the shutter button. While in this 
mode, shown in Fig. 2 as the preview mode 210, the camera constantly captures the preview
images 220. The capture interval is usually semi-real time which means fractions of a tenth
of a second or less. The camera saves each new preview image if it satisfies some test
criteria, 222. If not, the camera continues, 211, to capture the next preview image without
saving the previous one. The process will continue until the final full resolution image is 
acquired 280 and saved 282 by fully depressing the shutter button. 

In a simple embodiment, if the test criteria are met, or if no test criteria exist, the system will 
constantly replace the previous saved preview image with the new preview image, 230. 
Alternatively, where multiple preview images can be saved, 240, the new image will be
placed [illegible in source]
continuously captures and saves new preview images 244 while each time clearing the oldest
image 242 from the stack, until the user takes the final picture. The reason for storing 
multiple preview images is based on the fact that the last image, or any single image, may not 
be the best reference image for comparison with the final full resolution image in the red-eye 
correction process. By storing multiple images, a better reference image can be achieved, 
and a closer alignment between the preview and the final captured image can be achieved. 
This concept will be further discussed in Figure 5, in the alignment stage 540. Other reasons 
for capturing multiple images are that a single image may be blurred due to motion, the 
subject had their eyes closed, the exposure was not set, etc. In yet another alternative embodiment,
the multiple images may assist in creating a single higher quality reference image, either of
higher resolution or assembled by taking different regions from different ones of the multiple
images. This concept of sub-pixel resolution may be combined with the upsampling process
as described in Figure 5, block 534.
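
The preview-saving behaviour described above can be sketched as follows. This is a minimal illustration only, not the implementation of the embodiment: the class name `PreviewBuffer`, its `accept` callback (standing in for the test criteria 222) and the dictionary frames are all hypothetical. A capacity of 1 corresponds to constantly replacing the saved preview, while a larger capacity keeps several previews and clears the oldest each time.

```python
from collections import deque


class PreviewBuffer:
    """Keeps the most recent accepted preview frames; the oldest is dropped first."""

    def __init__(self, capacity=1, accept=None):
        # capacity=1 reproduces the simple "always replace" behaviour.
        self._frames = deque(maxlen=capacity)
        # accept is a hypothetical test-criteria callback; None means no criteria.
        self._accept = accept

    def offer(self, frame):
        """Save the frame if it passes the test criteria; return True if saved."""
        if self._accept is not None and not self._accept(frame):
            return False  # discard and wait for the next preview
        self._frames.append(frame)  # a full deque evicts its oldest entry
        return True

    def frames(self):
        """Saved previews, oldest first."""
        return list(self._frames)


# Five previews arrive; the third fails the (assumed) sharpness criterion.
buf = PreviewBuffer(capacity=3, accept=lambda f: f["sharp"])
for i, sharp in enumerate([True, True, False, True, True]):
    buf.offer({"id": i, "sharp": sharp})
saved_ids = [f["id"] for f in buf.frames()]
```

With a capacity of three, the buffer ends up holding the three most recent accepted previews, which a later stage could choose between when selecting a reference image.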

The test criteria 222 may involve the actual analysis of the preview image content before 
deciding whether the new preview image should replace a previously saved image. Such 
criteria may be based on image analysis such as the existence of faces in the image, detection 
of eyes or metadata analysis such as the exposure condition, whether a flash is going to 
happen, the distance to the subjects, etc. 

As part of the red-eye filter 90 the full resolution image and the preview image or images
will be loaded into working memory, 292 and 294, unless they are already in memory, in
which case they will just be accessed through a pointer.

Referring to Figures 3-a to 3-e, the digitization process in various resolutions is explained and
depicted. Figure 3-a illustrates the grid-like nature of a sensor as illustrated in Figure 1, block
60. The sensor comprises multiple cells 302 which determine the camera resolution. For 
example a sensor of 2000x3000 cells will be a 6 Million pixel sensor (it will be understood 
that each cell in fact comprises a plurality of individual sensor elements sensitive to different 
colours, e.g. RGB or RGBG, to create each coloured image pixel). 

Figure 3-b depicts the optical projection of a face 310 onto the sensor 60. Each of the cells 
302 records the average light information it receives for the image. This is the process of 
digitization and quantization.

The degree of detail is determined by the resolution of the sensor as depicted in Figure 3-c.
In this illustration a much smaller sensor is used, and in this case the same face 310 is 
digitized into a smaller number of pixels, or alternatively subsampled from the full resolution 
sensor data into a smaller number of pixel representations. 

Figure 3-d describes the inverse process where the subsampled image of Figure 3-c is
upsampled to the same size as the original. When comparing the resolution, naturally
some of the details are lost in this process. As an illustrative example, while in Figure 3-b the
face was roughly 25x25=625 pixels, in Figure 3-d the face is made of only 5x5=25 pixels.
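
The loss of detail in the subsample and upsample round trip can be sketched as follows. This is an illustration under assumptions: block averaging and nearest-neighbour repetition are chosen here for simplicity and are not stated to be the methods of the embodiment, and the function names and the 25x25 test image are hypothetical.

```python
def subsample(img, factor):
    """Block-average a 2-D list of numbers by an integer factor."""
    h, w = len(img) // factor, len(img[0]) // factor
    out = []
    for by in range(h):
        row = []
        for bx in range(w):
            block = [img[by * factor + dy][bx * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def upsample(img, factor):
    """Nearest-neighbour enlargement: repeat each pixel factor x factor times."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


# 25x25 test image: dark background with a bright 3x3 detail at rows/cols 6..8.
full = [[0.0] * 25 for _ in range(25)]
for y in range(6, 9):
    for x in range(6, 9):
        full[y][x] = 100.0

small = subsample(full, 5)      # 5x5 = 25 pixels
restored = upsample(small, 5)   # back to 25x25 = 625 pixels
```

The restored image has the original size, but the 3x3 detail has been smeared into a uniform 5x5 block of lower intensity, which is the kind of discrepancy the later comparison stages have to tolerate.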

Of course, the above is for illustration purposes only. In practice, because real sensors have a
much higher resolution than in this illustration, a normal eye will be depicted by a much
larger pixel count. Figure 3-e displays such a digitized eye. In this figure, an
eye 350, as imaged on a sensor 60, will be roughly 25 pixels wide, 352. Of particular
interest for this invention, the inner portion, the iris 360, will in this case be roughly 8 pixels
in diameter, as illustrated in 462.

According to a preferred embodiment of this invention, the preview image and the final 
image, or portions of them, need to be aligned as depicted in Figure 5, block 540. As 
explained above, the reference image and the final image may have different resolutions. 
The discrepancy in resolution may lead to differences in content, or pixel values, even though 
no data was changed in the subject image. In particular, downsampling and then upsampling
may have a blurring or averaging effect on the pixels in edge regions. Thus direct
comparison of different resolution images, even when aligned, may lead to false contouring.
In addition, the reference image may be acquired prior to or after the final image is captured.
Due to the above reasons, there is a need to match the two images, both in content and pixel 
resolution, as described below. 

Figure 4 better illustrates the effect of the sub- and up-sample process in finding the
pixelwise difference between two images. In this case, the input images are the ones
illustrated in Figures 3-b and 3-d, of high resolution and low resolution respectively. In this
figure, white squares such as 430 mean that there is no difference between the two.
Checkered squares or pixels, such as 420, mean that there is a difference between the images.

The flat regions should display no significant differences due to resolution changes. The
main differences arise from two causes. The first is edge regions, where changes in value
occur, such as in blocks 410. The second, which is of
interest to this invention, is displayed in 430. In these pixels, the difference is caused by the
actual change in the color of the eye from normal eyes to red-eyes. Not only is there a
change in the pixel value but the change is also more specifically reflected as a change to a red
or light color from the normal color of the iris or from the black color of the pupil.
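
The observation that flat regions agree while edge regions differ can be illustrated with a small difference map. This is a sketch under assumptions: the helper names, the tolerance value and the pair-averaging stand-in for the low-resolution image are hypothetical and chosen only to keep the example short.

```python
def pair_average(row):
    """Low-resolution stand-in: average each horizontal pixel pair, then repeat it."""
    out = []
    for i in range(0, len(row), 2):
        mean = (row[i] + row[i + 1]) / 2
        out.extend([mean, mean])
    return out


def difference_mask(a, b, tol=10.0):
    """True where two same-size greyscale images differ by more than tol."""
    return [[abs(pa - pb) > tol for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]


# Flat dark area on the left, flat bright area on the right, edge at column 3.
high = [[0.0, 0.0, 0.0, 100.0, 100.0, 100.0, 100.0, 100.0] for _ in range(4)]
low = [pair_average(row) for row in high]
mask = difference_mask(high, low)
```

Only the two columns straddling the edge are flagged; the flat areas on either side compare as equal. A red-eye region would be flagged as well, but for a different reason, namely an actual change in color rather than a resolution effect.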

Figures 5-a to 5-d illustrate the workflow of the red-eye filter 90 of this embodiment, as well 
as variations thereof. 

Referring first to Fig. 5-a, there are two input images into the filter, namely a full resolution 
image 510, I(x,y) which is the one that was captured by full depression of the shutter button
and needs to be analyzed for red-eye artefacts, and a preview image 520, P(x,y) which is used 
as a reference image and is nominally the same scene as the image I(x,y) but taken without 
the flash. The preview image may be a result of some image processing taking into account 
multiple preview images and creating a single image, 522. Methods of improving image 
quality based on multiple images are familiar to those versed in the art of image processing. 
The resulting output from the analysis process of 522 is a single preview image. 

The preview image 520 is normally, but not necessarily, of lower resolution than the full 
resolution image 510, typically being generated by clocking out a subset of the sensor cells or
by averaging the raw sensor data. Therefore, the two images, or alternatively the relevant
regions in the images (i.e. the regions containing or suspected to contain eyes, which can be
determined by image processing techniques known in the art), need to be matched in pixel
resolution, 530. In the present context "pixel resolution" means the size of the image, or 
relevant region, in terms of the number of pixels constituting the image or region concerned. 
Such a process may be done by either upsampling the preview image, 534, downsampling the 
acquired image, 532, or a combination thereof. Those familiar with the art will be aware of
several techniques suited to such sampling methods. The result of step 530 is a pair of
images I'(x,y) and P'(x,y) corresponding to the original images I(x,y) and P(x,y), or relevant
regions thereof, with matching pixel resolution. The system and method of the preferred
embodiment involves the detection [illegible in source]
the red-eye will [illegible in source]
portions of the detection of red-eye candidate pixel groupings, the subsequent testing of said
pixel groupings for determining false red-eye groupings, and the initial step of the removal, 
where the image is presented to the user for user confirmation of the correction, can be 
performed on the entire image, the subsampled image, or a subset of regions of the entire 
image or the subsampled image. 

Although nominally of the same scene, the preview image and the finally acquired full 
resolution image may differ spatially due to the temporal lag between capturing the two 
images. Therefore, the two images, or relevant regions thereof, may need to be aligned, 540, 
especially in respect of regions of the images containing or suspected to contain eyes. 
Essentially, alignment means transforming at least one of the images, and in this embodiment
the preview image P'(x,y), to obtain maximum correlation between the images, or relevant
regions thereof, based on measurable characteristics such as colour, texture and edge analysis.
Those familiar with the art are aware of several algorithms to achieve such alignment; see, for
example, US 6295367 which describes alignment of images due to object and camera 
movement and US 5933546 which addresses the use of multi-resolution data for pattern 
matching. 

Further discussion on the alignment is presented in Figure 5-c. In this Figure, the inputs are
the two images I'(x,y) and P'(x,y) as defined in Figure 5-a. The alignment may be global for
the entire image or local for specific regions. A simple example is a linear alignment, such as
a shift in the horizontal direction by H pixels, and/or in the vertical direction by V pixels, or a
combination of the two. Mathematically, the shifted image, P"(x,y), can be described as:

P"(x,y) = P'(x-H, y-V)

However, a simple translation may not suffice to align the images.
Therefore, there may be a need for X-Y shearing, which is a symmetrical shift of the object's 
points in the direction of the axis to correct for perspective changes; X-Y tapering where the 
object is pinched by shifting its coordinates towards the axis, the greater the magnitude of the 
coordinate the further the shift; or rotation around an arbitrary point. 
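
The pure-translation case can be sketched as follows. This is an illustration under assumptions: the exhaustive search and the mean-absolute-difference score are stand-ins chosen for brevity (the text speaks of maximum correlation and does not prescribe a search method), and the function names and test images are hypothetical.

```python
def shift(img, h, v, fill=0.0):
    """Return P2 with P2(x, y) = P(x - h, y - v); uncovered pixels get fill."""
    rows, cols = len(img), len(img[0])
    out = [[fill] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            sx, sy = x - h, y - v
            if 0 <= sx < cols and 0 <= sy < rows:
                out[y][x] = img[sy][sx]
    return out


def best_shift(ref, img, max_shift=2):
    """Search (h, v) minimising the mean absolute difference over the overlap."""
    rows, cols = len(ref), len(ref[0])
    best = None
    for v in range(-max_shift, max_shift + 1):
        for h in range(-max_shift, max_shift + 1):
            total, n = 0.0, 0
            for y in range(rows):
                for x in range(cols):
                    sx, sy = x - h, y - v
                    if 0 <= sx < cols and 0 <= sy < rows:
                        total += abs(ref[y][x] - img[sy][sx])
                        n += 1
            score = total / n
            if best is None or score < best[0]:
                best = (score, h, v)
    return best[1], best[2]


# Preview with one bright blob; the "final" image is the preview moved by H=2, V=1.
preview = [[0.0] * 8 for _ in range(8)]
preview[3][2] = preview[3][3] = preview[4][2] = 50.0
final = shift(preview, 2, 1)

h, v = best_shift(final, preview)
aligned = shift(preview, h, v)
```

In this toy case the search recovers the shift exactly and the aligned preview coincides with the final image. Shear, taper or rotation would need a richer transformation than this two-parameter search.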

In general, the alignment process may involve an affine transformation, defined as a special
class of projective transformations that do not move any objects from the affine space to 
the plane at infinity or conversely, or any transformation that preserves collinearity (i.e. all 
points lying on a line initially still lie on a line after transformation) and ratios of distances 
(e.g., the midpoint of a line segment remains the midpoint after transformation). Geometric 
contraction, expansion, dilation, reflection, rotation, shear, similarity transformations, spiral 
similarities and translation are all affine transformations, as are their combinations. In 
general, the alignment 540 may be achieved via an affine transformation which is a 
composition of rotations, translations, dilations, and shears, all well-known to one familiar in 
the art of image processing. 

If it is determined through a correlation process that a global transformation suffices, as 
determined in block 542 =YES, one of the images, and for simplicity the preview image, will 
undergo an affine transformation, 544, to align itself with the final full resolution image. 
Mathematically, this transformation can be depicted as:

P" = AP' + q

where A is a linear transformation and q is a translation.
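
The form of the affine map, a linear transformation A followed by a translation q, can be illustrated on point coordinates. This is a sketch only: the function name and the numeric values of A and q are hypothetical, and warping a whole image would additionally require resampling, which is omitted here.

```python
def affine(point, A, q):
    """Map a point p = (x, y) to A p + q, with A a 2x2 matrix and q a 2-vector."""
    x, y = point
    return (A[0][0] * x + A[0][1] * y + q[0],
            A[1][0] * x + A[1][1] * y + q[1])


# Illustrative values: a 90 degree rotation combined with a scale of 2, then a translation.
A = [[0.0, -2.0],
     [2.0, 0.0]]
q = (3.0, 1.0)

a, b = (1.0, 1.0), (5.0, 3.0)
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

a2, b2, mid2 = affine(a, A, q), affine(b, A, q), affine(mid, A, q)
mid_of_images = ((a2[0] + b2[0]) / 2, (a2[1] + b2[1]) / 2)
```

The transformed midpoint equals the midpoint of the transformed endpoints, which is the ratio-preserving property mentioned in the definition above.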

However, in some cases a global transformation may not work well, in particular for cases
where the subject matter moved, as could happen when photographing animate subjects. In
such cases, in particular in images with multiple human subjects, and when the subjects move
independently, the process of alignment 540 may be broken down, 546, into numerous
local regions each with its own affine transformation. What is important is to align the eyes
between the images. Therefore, according to this alternative, one or multiple local
alignments may be performed, 548, for regions in the vicinity surrounding the eyes, such as 
faces. 

Only after the images are aligned can one compare the potential red-eye colors. 

In the preferred embodiment of Figure 5-a, the preview image information is used as part of 
the falsing stage 96. Blocks 92, 94 and 98 correspond to the same blocks in Figure 1, being 
the stages of pixel locator, shape analyzer and pixel modification respectively. This
embodiment can incorporate [illegible in source]
described in US 6,407,777 (DeLuca), the functions of the pixel locator 92 and shape analyzer
94 being performed on the image I'(x,y) and the pixel modifier 98 operating on the original
acquired image I(x,y). Block 96, which is the falsing stage, is improved in this embodiment 
as compared to the falsing stage of DeLuca. 

Referring to block 96, for each region of the image I'(x,y) suspected as red-eye, step 596-2, 
as identified by steps 92 and 94, the suspected region is tested by comparing the pixel values 
of the region with the pixel values of the corresponding region in the aligned preview image 
P"(x,y), 596-6. However, prior to doing so, the regions need to be prepared and modified for 
such comparison, 596-4. 

Because the regions may not match exactly, a pixel-by-pixel comparison may not
suffice. One reason for the mismatch is the original size discrepancy; its effect
at edges is graphically illustrated in Figure 4. Other reasons for a
mismatch are potential movement of the object, or averaging
in the low resolution preview image that may lose high frequency color data. Such effects
are referred to as smoothing and aliasing. In addition, even if the alignment is optimal, there
may be sub-pixel misalignment that cannot be accounted for. Moreover, there may be color
differences between the preview image, shot using available light, and the acquired full
resolution image, which is shot using flash. In many cases, the color transformation from
one image to the other is not global and uniform. Hence the need for the process of preparing the
regions for comparison.

This process, as illustrated in block 596-4, will be further described in Figure 5-d. The
underlying concept behind step 596-4 is to distinguish between differences that are caused 
due to the acquisition process and the differences that are caused due to the existence of red- 
eye in the image. This problem is well known to one familiar in the art of statistical pattern 
matching and scene analysis and image recognition. An example of such an application 
taking into account differences due to resolution is described in US 5933546. 

If a region in the aligned preview image P"(x,y) was red and the equivalent region is red in
the image I'(x,y), 596-6, that region will be eliminated from I'(x,y) as a red-eye artefact,
596-9, and the corresponding region will be eliminated as a red-eye artefact from the original full
resolution image I(x,y). Otherwise, the region will continue to remain suspected as red-eye,
596-8. The process will continue, 596-3, for all suspected regions.

The comparison of the regions for a color value is done as a global operation on the entire 
region, and the answer to the question of whether a region is red or not is made statistically 
for the entire region and not pixel by pixel, i.e. it does not depend on the value of any 
particular individual pixel. Such an approach will account for inconsistencies on the pixel 
level that may be resolved statistically when analyzing a larger collection of pixels 
constituting a region. For example, some regions of the eye may not be fully red, or may 
display other artefacts such as a glint of high luminance. Another example of the need for a 
global statistical operation is the presence of noise in the image. Techniques are known in 
the art for such global comparison. 
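
One simple form such a global decision could take is sketched below, assuming that a region 
is given as a list of (R, G, B) pixel values. The per-pixel redness test, the ratio of 1.5 
and the 60% majority threshold are all illustrative assumptions rather than values taken 
from this specification. 

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def pixel_is_reddish(pixel: Pixel, ratio: float = 1.5) -> bool:
    """Assumed per-pixel test: red clearly dominates both green and blue."""
    r, g, b = pixel
    return r >= ratio * max(g, b, 1)


def region_is_red(pixels: List[Pixel], min_fraction: float = 0.6) -> bool:
    """Decide redness for the region as a whole, not for any single pixel."""
    if not pixels:
        return False
    red_count = sum(1 for p in pixels if pixel_is_reddish(p))
    return red_count / len(pixels) >= min_fraction
```

Because the decision rests on the fraction of reddish pixels, a few white glint pixels or 
noisy pixels inside an otherwise red pupil do not change the outcome. 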

Based on the information above, the regions finally identified as red-eye artefacts can be 
modified, 98, to eliminate the red-eye from the original full resolution image I(x,y). The 
modification can be done using any one of numerous available techniques such as luminance 
reduction, chrominance reduction, or subtraction of the artefact, as described in US Published 
Patent Application 2002/0150306 (Baron). 
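
As a minimal sketch of one possible modification, the red channel of each pixel in a 
confirmed region may be limited to the mean of its green and blue channels. This particular 
rule is an assumption chosen for brevity. It is not the technique of Baron, and the function 
names are hypothetical. 

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def reduce_red(pixel: Pixel) -> Pixel:
    """Limit the red channel to the mean of green and blue (illustrative rule)."""
    r, g, b = pixel
    return (min(r, (g + b) // 2), g, b)


def correct_region(
    image: List[List[Pixel]], region_pixels: Iterable[Tuple[int, int]]
) -> List[List[Pixel]]:
    """Return a copy of the image with the listed (x, y) positions corrected."""
    corrected = [list(row) for row in image]  # leave the input image untouched
    for x, y in region_pixels:
        corrected[y][x] = reduce_red(corrected[y][x])
    return corrected
```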

Figure 5-d describes the preparation of regions suspected of red-eye for comparison as 
described in Figure 5-a, block 596-4. As discussed above, a simple pixel level comparison 
may not be enough to determine whether or not the region is of red-eye nature. The process of 
preparation may include a combination of several components such as creating color balance 
between the regions of the preview image and the final image, 1510, analyzing the texture, or 
differences in high frequency patterns between the two regions that may have occurred due to 
the change in resolution, 1520, and comparing the edges between the two regions, 1530, 
where the differences may have occurred due to change in exposure, color balance, resolution 
or alignment, and in particular sub-pixel alignment. The color balance step 1510 comprises 
marking each red-eye region in I'(x,y) and the corresponding region in P"(x,y), steps 1512 
and 1514, determining the difference in color balance between the region in I'(x,y) 
surrounding, but not including, the suspected red-eye region and the corresponding region of 
P"(x,y), step 1516, and transforming the entire region, including the suspected red-eye 
region, based on the color balance difference so determined, step 1518. 
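
The color balance steps 1516 and 1518 might be sketched as follows. The sketch assumes that 
the color balance difference is modelled as a per-channel gain, and that the gain is applied 
to the preview region so that it matches the flash image. Both choices, and all function 
names, are assumptions made for illustration. 

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # (R, G, B)


def channel_means(pixels: List[Pixel]) -> Tuple[float, ...]:
    """Mean of each colour channel over a non-empty list of pixels."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def balance_gains(flash_surround: List[Pixel],
                  preview_surround: List[Pixel]) -> Tuple[float, ...]:
    """Step 1516: per-channel gain from the surround, excluding the suspect region."""
    f = channel_means(flash_surround)
    p = channel_means(preview_surround)
    return tuple(f[c] / p[c] if p[c] > 0 else 1.0 for c in range(3))


def apply_gains(pixels: List[Pixel], gains: Tuple[float, ...]) -> List[Pixel]:
    """Step 1518: transform the entire region, including the suspect pixels."""
    return [tuple(min(255.0, p[c] * gains[c]) for c in range(3)) for p in pixels]
```

Estimating the gain only from the surround keeps the red-eye pixels themselves from biasing 
the color balance estimate. 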

As an alternative embodiment of this invention, the preview image can be used as part of the 
pixel locator stage 92, as illustrated in Figure 5-b, rather than as part of the falsing analyzer 
96. In Fig. 5-b, blocks 510, 520, 522, 530, 532, 534, 540, 94 and 98 are identical to those in 
Figure 5-a. According to this embodiment, the use of the preview image in order to detect 
red-eye artefacts is implemented as part of the red-eye identification process, otherwise 
described as the pixel locator 92 in Figure 1 but here identified as Pixel Analyser and Region 
Segmenter 592. 

After the suspected red-eye regions are identified, the process continues via the shape 
analysis 94, false detection elimination 96 and correction 98 as described in Figure 1. In this 
case, the falsing detector 96 may be performed according to DeLuca. 

According to this embodiment, after the alignment step 540, the following steps 592-1a and 
592-1b analyse both images I'(x,y) and P"(x,y) for the presence of pixels having a colour 
indicative of red-eye (592-1a), for example in the manner of DeLuca, and then identify 
clusters of contiguous red pixels so detected (592-1b). This is known as segmentation and is 
more fully described in US Pat. Appn. 2002/0176623. 
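
The clustering of contiguous red pixels can be illustrated with a generic connected-component 
search over a boolean mask of red pixels. This is a textbook breadth-first flood fill with 
4-connectivity and is offered only as a sketch. It is not the segmentation method of the 
cited application, and the mask representation is an assumption. 

```python
from collections import deque
from typing import List, Set, Tuple


def red_clusters(mask: List[List[bool]]) -> List[Set[Tuple[int, int]]]:
    """Group True cells of mask[y][x] into 4-connected clusters of (x, y) positions."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    clusters = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            cluster = set()
            queue = deque([(x, y)])
            seen[y][x] = True
            while queue:  # breadth-first flood fill from the seed pixel
                cx, cy = queue.popleft()
                cluster.add((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            clusters.append(cluster)
    return clusters
```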

Now, each region (cluster) with red content in the acquired image I'(x,y), step 592-2, is 
compared with the corresponding region in the aligned preview image P"(x,y). The regions 
will need to be prepared, 592-4, as previously described in relation to block 596-4 of Figure 
5-a. If the regions are red in both cases, 592-6 = YES, the region will not be marked as 
red-eye, no action will be taken and the process will continue to the next suspected region, 
592-3. If the region is red in the acquired image I'(x,y) while the corresponding region is 
not red in the preview image P"(x,y), 592-6 = NO, then the region will be marked as suspected 
red-eye, 592-8. 

Fig. 6-a shows a modification of the embodiment of Fig. 5-b in which Step 540 (Align 
Images) has been divided into two steps, Step 541 (If Possible Globally Align Images) and 
Step 592-3 (If Required Locally Align Images). Step 541 corresponds to Steps 542 and 544 
of Fig. 5-c. However, if a global alignment is not possible or practical, the local alignment is 
deferred until after red pixel identification and clustering has been performed, since the 
presence of such clusters in the two images I'(x,y) and P'(x,y) will assist in the local 
alignment. Fig. 6-b shows a similar modification applied to Fig. 5-a. 

In the embodiments of the invention, in the comparison stages 592-6 and 596-6, the pixel 
values do not necessarily have to be compared with red but may alternatively or additionally 
be compared with other values such as yellow, white, pink, brown or other colour indicative 
of a red-eye phenomenon, or with a range of values, to accommodate other flash related eye 
artefacts that are not purely red. Because the eye surface is retro-reflective (due to 
the smoothness created by the tears, and the spherical shape of the eyeball), the technique 
described in this specification can assist in the detection of the eyes in an image. The 
existence of an eye can be found by comparing the specular reflection of the flash in the 
eye with the same region where no flash was used, and thus without specular reflection. This 
comparison may assist in locating eyes in general and not just eyes with red-eye artefacts. 
This process may be implemented by finding the change of small specular reflections that 
occur in the eye region when flash illumination is used, such as described in WO 03/026278 
(Jarman). The specular reflections may be used as another indication of suspected regions as 
defined in blocks 592-2 and 596-2 by comparing the specular reflection in the flash image 
with the absence of specular reflection in the preview image. 
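
A very simple version of this specular comparison is sketched below. It assumes that a glint 
shows up as at least one pixel of very high luminance, uses the common Rec. 601 luma weights 
to approximate luminance, and adopts an arbitrary threshold of 240. It is not the method of 
Jarman, and the function names are hypothetical. 

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def luminance(pixel: Pixel) -> float:
    """Approximate luminance using Rec. 601 luma weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def has_new_glint(flash_pixels: List[Pixel],
                  preview_pixels: List[Pixel],
                  threshold: float = 240.0) -> bool:
    """True if the flash region contains a glint that the preview region lacks."""
    flash_glint = any(luminance(p) >= threshold for p in flash_pixels)
    preview_glint = any(luminance(p) >= threshold for p in preview_pixels)
    return flash_glint and not preview_glint
```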

As an alternative to a binary decision of adding or eliminating a region, 596-8 and 596-9, 
where a continuous probability is maintained for each region, the process may instead change 
that probability. The amount of such change in probability may be determined from an analysis 
of the confidence level of the comparison 592-4 and 596-4. 
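
One way such a probability adjustment could be expressed is sketched below. The linear 
update rule is an assumption chosen for simplicity, since this specification does not 
prescribe a formula. The confidence value is taken to lie between 0 and 1. 

```python
def update_probability(prior: float, red_in_preview: bool, confidence: float) -> float:
    """Move a region's red-eye probability according to the preview comparison.

    With confidence 0 the prior is unchanged. With confidence 1 the result
    equals the binary decision (0.0 if the preview region is red, else 1.0).
    """
    if not 0.0 <= prior <= 1.0 or not 0.0 <= confidence <= 1.0:
        raise ValueError("prior and confidence must lie in [0, 1]")
    if red_in_preview:
        # Red without flash: evidence against a flash-induced artefact.
        return prior * (1.0 - confidence)
    # Not red without flash: evidence for a flash-induced artefact.
    return prior + (1.0 - prior) * confidence
```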

The preferred embodiments described above may be modified by adding or changing 
operations, steps and/or components in many ways to produce advantageous alternative 
embodiments. For example, the reference image can be a post-view image rather than a 
preview image, i.e. an image taken without flash immediately after the flash picture is taken. 

A red-eye correction procedure may begin as described by block 92 with detecting a human 
face in a digital image and, based on this detection, finding the eyes in the face (see, e.g., U.S. 
Pat. No. 6,252,976, U.S. Publ. Pat. App. No. 2003/0044070 and U.S. Pat. No. 6,278,491). 
This procedure may also be used for creating the regional alignment 546 and color balance 
1510. 

A range of alternative techniques may be employed to detect and verify the existence of red- 
eye defects in an image (see, e.g., U.S. Publ. Pat. Apps. No. 2003/0044177 and 
2003/0044178). These techniques may be incorporated into the pixel locator, shape analyzer, 
falsing analyzer and pixel modifier corresponding to blocks 92, 94, 96 and 98. A camera 
may include software or firmware for automatically detecting a red-eye image using a variety 
of image characteristics such as image brightness, contrast, the presence of human skin and 
related colors. The analysis of these image characteristics may be utilized, based on certain 
pre-determined statistical thresholds, to decide if red-eye defects exist and if a flash was used 
to take the original image. 

The preferred embodiments described herein may involve expanded digital acquisition 
technology that inherently involves digital cameras, but that may be integrated with other 
devices such as cell-phones equipped with an acquisition component, toy cameras etc. The 
digital camera or other image acquisition device of the preferred embodiment has the 
capability to record not only image data, but also additional data referred to as meta-data. 
The file header of an image file, such as JPEG, TIFF, JPEG-2000, etc., may include capture 
information including the preview image, for processing and red-eye detection at a later post 
processing stage, which may be performed in the acquisition device or in a separate device 
such as a personal computer. The preferred embodiments described herein serve to improve 
the detection of red-eyes in images, while eliminating or reducing the occurrence of false 
positives, and to improve the correction of the detected artefacts. 

While an exemplary drawing and specific embodiments of the present invention have been 
described and illustrated, it is to be understood that the scope of the present invention is 
not to be limited to the particular embodiments discussed. Thus, the embodiments shall be 
regarded as illustrative rather than restrictive, and it should be understood that variations may 
be made in those embodiments by workers skilled in the arts without departing from the 
scope of the present invention, as set forth in the claims below and structural and functional 
equivalents thereof. 

In addition, in methods that may be performed according to preferred embodiments herein 
and that may have been described above, the operations have been described in selected 
typographical sequences. However, the sequences have been selected and so ordered for 
typographical convenience and are not intended to imply any particular order for performing 
the operations, unless expressly set forth or understood by those skilled in the art as being 
necessary. 

Thus, the preferred embodiments described herein provide an improved method and 
apparatus for detecting red-eye phenomenon within images taken by a digital camera having 
a flash while eliminating or reducing the occurrence of false positives by using preview 
information. 

Claims 

1. A digital image acquisition system having no photographic film, comprising a 
portable apparatus for capturing digital images, a flash unit for providing illumination during 
image capture, and a red-eye filter for detecting a region within a captured image indicative 
of a red-eye phenomenon, said detection being based upon a comparison of said captured 
image and a reference image of nominally the same scene taken without flash. 

2. A system according to claim 1, wherein the reference image is a preview image of 
lower pixel resolution than the captured image, the filter including means for matching the 
pixel resolutions of the captured and reference images by at least one of up-sampling the 
preview image and sub-sampling the captured image. 

3. A system according to claim 1 or 2, the filter further including means for aligning at 
least portions of the captured image and reference image prior to said comparison. 

4. A system according to claim 1, 2 or 3, wherein the filter detects said region indicative 
of a red-eye phenomenon by identifying a region in the captured image at least having a 
colour indicative of a red-eye phenomenon and comparing said identified region with the 
corresponding region in the reference image, the filter further designating said region as 
indicative of a red-eye phenomenon if said corresponding region does not have a colour 
indicative of a red-eye phenomenon. 

5. A system according to claim 4, wherein the decision as to whether a region has a 
colour indicative of a red-eye phenomenon is determined on a statistical basis as a global 
operation on the entire region. 

6. A system according to claim 4 or 5, wherein the filter also includes a shape analyser 
to identify a region in the captured image having both a shape and colour indicative of a red- 
eye phenomenon for subsequent comparison with the corresponding region in the reference 
image. 

7. A system according to claim 4 or 5, wherein the filter also includes a shape analyser 
to determine subsequent to said comparison whether a region designated as indicative of a 
red-eye phenomenon has a shape indicative of a red-eye phenomenon. 

8. A system according to any preceding claim, wherein said digital image acquisition 
system is a digital camera. 

9. A system according to any one of claims 1 to 7, wherein said digital image acquisition 
system is a combination of a digital camera and an external processing device. 

10. A system as claimed in claim 9 wherein said red-eye filter is located in said external 
processing device. 

11. A system according to any preceding claim, further including a pixel modifier for 
modifying the colour of the pixels within a region indicative of a red-eye phenomenon. 

12. A digital image acquisition system having no photographic film, comprising a 
portable apparatus for capturing digital images, a flash unit for providing illumination during 
image capture, and a red-eye filter for detecting red-eye phenomenon in a captured image 
based upon a comparison of said captured image and a reference image of nominally the 
same scene taken without flash, wherein the reference image is a preview image of lower 
pixel resolution than the captured image, the filter including means for matching the pixel 
resolutions of the captured and reference images by at least one of up-sampling the preview 
image and sub-sampling the captured image. 

13. A system according to claim 12, the filter further including means for aligning at least 
portions of the captured image and reference image prior to said comparison. 

14. A system according to claim 12 or 13, wherein the filter detects said red-eye 
phenomenon by identifying a region in the captured image having a colour indicative of a 
red-eye phenomenon and comparing said identified region with the corresponding region in 
the reference image, the filter further designating said region as indicative of a red-eye 
phenomenon if said corresponding region does not have a colour indicative of a red-eye 
phenomenon. 

15. A system according to claim 14, wherein the decision as to whether a region has a 
colour indicative of a red-eye phenomenon is determined on a statistical basis as a global 
operation on the entire region. 

16. A digital image acquisition system having no photographic film, comprising a 
portable apparatus for capturing digital images, a flash unit for providing illumination during 
image capture, and a red-eye filter for detecting red-eye phenomenon in a captured image 
based upon a comparison of said captured image and a reference image of nominally the 
same scene taken without flash, the filter further including means for aligning at least 
portions of the captured image and reference image prior to said comparison. 

17. A system according to claim 16, wherein the filter detects said red-eye phenomenon 
by identifying a region in the captured image having a colour indicative of a red-eye 
phenomenon and comparing said identified region with the corresponding region in the 
reference image, the filter further designating said region as indicative of a red-eye 
phenomenon if said corresponding region does not have a colour indicative of a red-eye 
phenomenon. 

18. A system according to claim 17, wherein the decision as to whether a region has a 
colour indicative of a red-eye phenomenon is determined on a statistical basis as a global 
operation on the entire region. 

19. A digital image acquisition system having no photographic film, comprising a 
portable apparatus for capturing digital images, a flash unit for providing illumination during 
image capture, and a red-eye filter for detecting red-eye phenomenon in a captured image 
based upon a comparison of said captured image and a reference image of nominally the 
same scene taken without flash, wherein the filter detects said red-eye phenomenon by 
identifying a region in the captured image having a colour indicative of a red-eye 
phenomenon and comparing said identified region with the corresponding region in the 
reference image, the filter further designating said region as indicative of a red-eye 
phenomenon if said corresponding region does not have a colour indicative of a red-eye 
phenomenon. 

20. A system according to claim 19, wherein the decision as to whether a region has a 
colour indicative of a red-eye phenomenon is determined on a statistical basis as a global 
operation on the entire region. 